Information Extraction
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Boyuan Pan, Yazheng Yang, Hao Li, Zhou Zhao, Yueting Zhuang, Deng Cai, Xiaofei He
Machine comprehension (MC) has gained significant popularity over the past few years, and it is a coveted goal in the field of natural language understanding. Its task is to teach the machine to understand the content of a given passage and then answer a related question, which requires deep comprehension of, and accurate information extraction from, the text.
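The extraction step described in this abstract is commonly formalized as predicting an answer span over the passage: a model scores each token as a possible answer start and end, and decoding picks the best valid span. A minimal sketch of that decoding rule (the scores and passage here are illustrative; this is a generic extractive-MC formulation, not the MacNet architecture itself):

```python
import numpy as np

def best_span(start_scores, end_scores, max_len=10):
    """Return the (start, end) pair maximizing start_scores[i] + end_scores[j]
    subject to i <= j < i + max_len -- a common decoding rule for extractive MC."""
    best, best_score = (0, 0), -np.inf
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            if s + end_scores[j] > best_score:
                best_score = s + end_scores[j]
                best = (i, j)
    return best

# Toy passage with made-up per-token scores (in a real model these come
# from the comprehension network's output layer).
passage = "the quick brown fox jumps over the lazy dog".split()
start = np.array([0.1, 0.2, 0.1, 2.0, 0.3, 0.1, 0.1, 0.2, 0.1])
end   = np.array([0.1, 0.1, 0.2, 0.5, 1.8, 0.1, 0.1, 0.1, 0.3])
i, j = best_span(start, end)
print(" ".join(passage[i:j + 1]))  # -> fox jumps
```

The `max_len` constraint keeps decoding linear in practice and prevents degenerate whole-passage answers.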
- North America > Canada > Quebec > Montreal (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.32)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
Appendix 1: Back-imagination and Back-speech
Figure 1: Illustrative examples for the two proposed techniques, Back-imagination and Back-speech.
Tiny ImageNet [Le and Yang, 2015] serves as a compact version of the comprehensive ImageNet dataset. The Stanford Sentiment Treebank-2 (SST-2) [Socher et al., 2013] is a sentiment classification dataset. Given the scarcity of datasets for understanding natural language in visual scenes, we introduce a novel textual entailment dataset, named Textual Natural Contextual Classification (TNCC). This dataset is formulated on the foundation of Crisscrossed Captions [Parekh et al., 2020], an image …
In this work, we employ a uniform experimental configuration for both the textual entailment and sentiment classification tasks. For the image classification task, we employ the ResNet18 [He et al., 2015] model, which is considered more suitable for small datasets.
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.57)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.55)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.55)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.35)
A Additional Results
The acronym dataset is a QA task that requires models to decode financial acronyms. The FinMA-7B-full model achieved the highest ROUGE-1 score of 0.12 and the …
B.1 Why was the datasheet created?
B.2 Has the dataset been used already? If so, where are the results so others can compare (e.g., links to published papers)?
Yes, the dataset has already been used. It was employed in the FinLLM Shared Task during the FinNLP-AgentScen Workshop at IJCAI 2024, known as the FinLLM Challenge.
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (5 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance > Trading (1.00)
- Government (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.41)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.41)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Qatar (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (19 more...)
- Education (0.46)
- Information Technology (0.46)
- Information Technology > Data Science (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
- (4 more...)
d921c3c762b1522c475ac8fc0811bb0f-AuthorFeedback.pdf
We wish to thank all of the reviewers for their time and thorough reading of our paper! We appreciate the reviewer's suggestions regarding clarity. We have added the suggested summary sentence "the key …"
We started with binary sentiment classification, but are actively working on more tasks.
RNN hidden states onto the top two PCs for two different input sequences that differ only by two tokens (replacing '…'). The trajectories start out the same as the initial tokens are identical. We have added a footnote noting this in the main text.
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.37)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.37)
An Instagram data breach reportedly exposed the personal info of 17.5 million users
As spotted by Malwarebytes, the alleged leak includes usernames, email addresses, phone numbers and more. If you received a bunch of password reset requests from Instagram recently, you're not alone. As reported by Malwarebytes, an antivirus software company, there was a data breach revealing the sensitive information of 17.5 million Instagram users. Malwarebytes added that the leak included Instagram usernames, physical addresses, phone numbers, email addresses and more. The company added that the data is available for sale on the dark web and can be abused by cybercriminals.
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.74)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.62)
- Information Technology > Data Science > Data Mining > Text Mining (0.50)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)
Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline
Existing video multi-modal sentiment analysis mainly focuses on the sentiment expressed by people within a video, yet often neglects the sentiment induced in viewers while watching it. Viewers' induced sentiment is essential for inferring the public response to videos and has broad applications in analyzing public societal sentiment, advertising effectiveness and other areas. Micro videos and their related comments provide a rich application scenario for analyzing viewers' induced sentiment. In light of this, we introduce a novel research task, Multimodal Sentiment Analysis for Comment Response of Video Induced (MSA-CRVI), which aims to infer opinions and emotions from comments responding to micro videos. Meanwhile, we manually annotate a dataset named Comment Sentiment toward Micro Video (CSMV) to support this research. To our knowledge, it is the largest video multi-modal sentiment dataset in terms of scale and video duration, containing 107,267 comments and 8,210 micro videos with a total video duration of 68.83 hours. Since inferring the induced sentiment of a comment requires leveraging the video content, we propose the Video Content-aware Comment Sentiment Analysis (VC-CSA) method as a baseline to address the challenges inherent in this new task. Extensive experiments demonstrate that our method shows significant improvements over other established baselines.
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Towards Robust Multimodal Sentiment Analysis with Incomplete Data
The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction that seeks to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we treat it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and a dominant modality based multimodal learning (DMML) module, which enhance the model's robustness across various noise scenarios by ensuring the quality of the dominant modality's representations. Beyond the methodological design, we perform comprehensive experiments under random data-missing scenarios, utilizing diverse and meaningful settings on several popular datasets (e.g., MOSI, MOSEI, and SIMS), providing greater uniformity, transparency, and fairness than existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines, demonstrating superior performance across these challenging and extensive evaluations.
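The evaluation protocol this abstract describes — randomly removing modality data and fusing with language as the dominant modality — can be sketched in a few lines. Everything below is illustrative: the function names, the fixed fusion weights, and the random-masking scheme are stand-ins for exposition, not the paper's learned LNLN modules.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_missing(features, missing_rate, rng):
    """Zero out a random fraction of feature entries, mimicking the
    random data-missing protocol described in the abstract."""
    mask = rng.random(features.shape) >= missing_rate
    return features * mask

def language_dominated_fusion(language, audio, vision, w_lang=0.6):
    """Weighted fusion that treats language as the dominant modality.
    The fixed weights here are illustrative, not LNLN's learned ones."""
    w_other = (1.0 - w_lang) / 2
    return w_lang * language + w_other * audio + w_other * vision

# Toy features: 4 samples, 8-dim embeddings per modality.
lang = rng.normal(size=(4, 8))
audio = rng.normal(size=(4, 8))
vision = rng.normal(size=(4, 8))

# Corrupt a non-dominant modality, then fuse with language dominating.
noisy_audio = simulate_missing(audio, missing_rate=0.5, rng=rng)
fused = language_dominated_fusion(lang, noisy_audio, vision)
print(fused.shape)  # -> (4, 8)
```

The design point the sketch illustrates: because the language features carry the largest fusion weight, corruption in the audio or vision stream perturbs the fused representation less than equal-weight fusion would.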
- Information Technology > Communications > Social Media (0.32)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.32)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.32)